79 research outputs found

    Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning

    Intrinsically motivated spontaneous exploration is a key enabler of autonomous lifelong learning in human children. It enables the discovery and acquisition of large repertoires of skills through self-generation, self-selection, self-ordering and self-experimentation of learning goals. We present an algorithmic approach called Intrinsically Motivated Goal Exploration Processes (IMGEP) to enable similar properties of autonomous or self-supervised learning in machines. The IMGEP algorithmic architecture relies on several principles: 1) self-generation of goals, generalized as fitness functions; 2) selection of goals based on intrinsic rewards; 3) exploration with incremental goal-parameterized policy search and exploitation of the gathered data with a batch learning algorithm; 4) systematic reuse of information acquired when targeting a goal for improving towards other goals. We present a particularly efficient form of IMGEP, called Modular Population-Based IMGEP, that uses a population-based policy and an object-centered modularity in goals and mutations. We provide several implementations of this architecture and demonstrate their ability to automatically generate a learning curriculum within several experimental setups, including a real humanoid robot that can explore multiple spaces of goals with several hundred continuous dimensions. While no particular target goal is provided to the system, this curriculum allows the discovery of skills that act as stepping stones for learning more complex skills, e.g. nested tool use. We show that learning diverse spaces of goals with intrinsic motivations is more efficient for learning complex skills than only trying to directly learn these complex skills.
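
    The goal exploration loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `environment`, the function names, the uniform goal sampling and all parameter values are assumptions made for illustration. In particular, goals are drawn uniformly here, so the intrinsic-reward goal selection (principle 2) and the modular, object-centered variant are left out.

```python
import random


def environment(policy):
    """Hypothetical toy environment: a 2-D policy yields a 1-D outcome."""
    x, y = policy
    return x + 0.5 * y


def imgep(iterations=300, bootstrap=20, sigma=0.05, seed=0):
    """Population-based goal exploration on the toy environment (a sketch)."""
    rng = random.Random(seed)
    population = []  # every (policy, outcome) pair observed so far

    # Bootstrap phase: try random policies to seed the population.
    for _ in range(bootstrap):
        policy = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        population.append((policy, environment(policy)))

    for _ in range(iterations):
        # Self-generate a goal in outcome space (uniform sampling is an
        # assumption; the abstract selects goals using intrinsic rewards).
        goal = rng.uniform(-1.5, 1.5)
        # Reuse everything gathered so far, whatever goal it was collected
        # for: start from the stored policy whose outcome is closest.
        best_policy, _ = min(population, key=lambda po: abs(po[1] - goal))
        # Policy search step: mutate that policy and clip it to its bounds.
        policy = tuple(
            max(-1.0, min(1.0, p + rng.gauss(0, sigma))) for p in best_policy
        )
        population.append((policy, environment(policy)))
    return population


def goal_error(population, goal):
    """Distance between a goal and the closest outcome reached so far."""
    return min(abs(outcome - goal) for _, outcome in population)
```

    Calling `imgep()` returns the full population; `goal_error(population, g)` then measures how closely any goal `g` can be reached from the stored data, including goals that were never targeted during exploration.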

    Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration

    Intrinsically motivated goal exploration algorithms enable machines to discover repertoires of policies that produce a diversity of effects in complex environments. These exploration algorithms have been shown to allow real world robots to acquire skills such as tool use in high-dimensional continuous state and action spaces. However, they have so far assumed that self-generated goals are sampled in a specifically engineered feature space, limiting their autonomy. In this work, we propose to use deep representation learning algorithms to learn an adequate goal space. This is a developmental 2-stage approach: first, in a perceptual learning stage, deep learning algorithms use passive raw sensor observations of world changes to learn a corresponding latent space; then goal exploration happens in a second stage by sampling goals in this latent space. We present experiments where a simulated robot arm interacts with an object, and we show that exploration algorithms using such learned representations can match the performance obtained using engineered representations

    Extracting Urban Object Detectors from an Ontology

    To arrive at a method for automatically interpreting remote sensing images of very high spatial resolution, it is necessary to exploit domain knowledge as much as possible. To detect different types of objects such as roads or buildings, very specific methods have been developed that achieve very good results. These methods use domain knowledge without formalizing it. In this article, we first propose to model the domain knowledge explicitly within an ontology. We then introduce an algorithm for building specific detectors that use the knowledge in this ontology. The clear separation between knowledge modeling and detector construction makes the interpretation process easier to follow. This decoupling also makes it possible to use the detector construction algorithm in another application domain, or to modify the detector construction algorithm without modifying the ontology.

    Modular Active Curiosity-Driven Discovery of Tool Use

    This article studies algorithms used by a learner to explore high-dimensional structured sensorimotor spaces such as in tool use discovery. In particular, we consider goal babbling architectures that were designed to explore and learn solutions to fields of sensorimotor problems, i.e. to acquire inverse models mapping a space of parameterized sensorimotor problems/effects to a corresponding space of parameterized motor primitives. However, so far these architectures have not been used in high-dimensional spaces of effects. Here, we show the limits of existing goal babbling architectures for efficient exploration in such spaces, and introduce a novel exploration architecture called Model Babbling (MB). MB efficiently exploits a modular representation of the space of parameterized problems/effects. We also study an active version of Model Babbling (the MACOB architecture). These architectures are compared in a simulated experimental setup with an arm that can discover and learn how to move objects using two tools with different properties, embedding structured high-dimensional continuous motor and sensory spaces.

    Towards hierarchical curiosity-driven exploration of sensorimotor models

    Curiosity-driven exploration mechanisms have been proposed to allow robots to actively explore high-dimensional sensorimotor spaces in an open-ended manner [1], [2]. In such setups, competence-based intrinsic motivations show better results than knowledge-based exploration mechanisms, which only monitor the learner's prediction performance [2], [3]. With competence-based intrinsic motivations, the learner explores its sensor space with a bias toward regions which are predicted to yield a high competence progress. Also, throughout its life, a developmental robot has to incrementally explore skills that add up to the hierarchy of previously learned skills, with a constraint being the cost of experimentation. Thus, a hierarchical exploration architecture could make it possible to reuse the sensorimotor models previously explored and to combine them to explore new complex sensorimotor models more efficiently. Here, we rely more specifically on the R-IAC and SAGG-RIAC series of architectures [3]. These architectures allow the learning of a single mapping between a motor and a sensor space with a competence-based intrinsic motivation. We describe some ways to extend these architectures with different task spaces that can be explored in a hierarchical manner, and mechanisms to handle this hierarchy of sensorimotor models that all need to be explored with an adequate number of trials. We also describe an interactive task to evaluate the hierarchical learning mechanisms, where a robot has to explore its motor space in order to push an object to different locations. The robot can first explore how to make movements with its hand and then reuse this skill to explore the task of pushing an object.

    Overlapping Waves in Tool Use Development: a Curiosity-Driven Computational Model

    The development of tool use in children is a key question for the understanding of the development of human cognition. Several authors have studied it to investigate how children explore, evaluate and select alternative strategies for solving problems. In particular, Siegler has used this domain to develop the overlapping waves theory that characterizes how infants continue to explore alternative strategies to solve a particular problem, even when one is currently better than others. In computational models of strategy selection for the problem of integer addition, Shrager and Siegler proposed a mechanism that maintains the concurrent exploration of alternative strategies with use frequencies that are proportional to their performance for solving a particular problem. This mechanism was also used by Chen and Siegler to interpret an experiment with 1.5- and 2.5-year-olds that had to retrieve an out-of-reach toy, and where they could use one of several available strategies that included leaning forward to grasp a toy with the hand or using a tool to retrieve the toy. In this paper, we use this domain of tool use discovery to consider other mechanisms of strategy selection and evaluation. In particular, we present models of curiosity-driven exploration, where strategies are explored according to the learning progress/information gain they provide (as opposed to their current efficiency to actually solve the problem). In these models, we define a curiosity-driven agent learning a hierarchy of different sensorimotor models in a simple 2D setup with a robotic arm, a stick and a toy. In a first phase, the agent learns from scratch how to use its robotic arm to control the tool and to catch the toy, and in a second phase with the same learning mechanisms, the agent has to solve three problems where the toy can only be reached with the tool. We show that agents choosing strategies based on a learning progress measure also display overlapping waves of behavior compatible with the one observed in infants, and we suggest that curiosity-driven exploration could be at play in Chen and Siegler's experiment, and more generally in tool use discovery.
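
    The contrast this abstract draws between choosing strategies in proportion to their performance and choosing them according to learning progress can be made concrete with a small sketch. The function names, the windowed estimate of learning progress and the `floor` term are assumptions made for illustration, not the measures used in the paper.

```python
def selection_probabilities(scores, floor=1e-6):
    """Selection probability of each strategy, proportional to its score.

    `floor` (an assumed constant) keeps every strategy selectable so that
    exploration of all alternatives continues.
    """
    weights = [max(score, 0.0) + floor for score in scores]
    total = sum(weights)
    return [weight / total for weight in weights]


def recent_performance(history, window=5):
    """Mean performance over the last `window` trials of one strategy."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def learning_progress(history, window=5):
    """Absolute change in mean performance between the two latest windows.

    This windowed difference is one simple proxy for learning progress;
    it returns 0.0 until two full windows of trials are available.
    """
    if len(history) < 2 * window:
        return 0.0
    recent = sum(history[-window:]) / window
    older = sum(history[-2 * window:-window]) / window
    return abs(recent - older)
```

    With one strategy that is already reliable (a flat history at 0.9) and one that is still improving (0.0 rising to 0.9), `selection_probabilities` applied to `recent_performance` favours the reliable strategy, while the same function applied to `learning_progress` favours the improving one.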

    Curiosity-Driven Development of Tool Use Precursors: a Computational Model

    Studies of child development of tool use precursors show successive but overlapping phases of qualitatively different types of behaviours. We hypothesize that two mechanisms in particular play a role in the structuring of these phases: the intrinsic motivation to explore and the representation used to encode sensorimotor experience. Previous models showed how curiosity-driven learning mechanisms could allow the emergence of developmental trajectories. We build upon those models and present the HACOB (Hierarchical Active Curiosity-driven mOdel Babbling) architecture that actively chooses which sensorimotor model to train in a hierarchy of models representing the environmental structure. We study this architecture using a simulated robotic arm interacting with objects in a 2D environment. We show that overlapping phases of behaviours emerge autonomously in hierarchical models using active model babbling. To our knowledge, this is the first model of curiosity-driven development of simple tool use and of the self-organization of overlapping phases of behaviours. In particular, our model explains why and how intrinsically motivated exploration of non-optimal methods to solve certain sensorimotor problems can be useful to discover how to solve other sensorimotor problems, in accordance with Siegler's overlapping waves theory, by scaffolding the learning of increasingly complex affordances in the environment.

    Trace element accumulation in Mn-Fe-oxide nodules of a planosolic horizon

    The aim of this work was to determine the importance of nodule formation on the dynamics of major and trace elements (TEs) along a Planosol toposequence developed in metamorphic parent material at La Châtre (Massif Central, France). The different horizons were sampled within three pits and analysed for major and trace element contents. The nodule-rich horizon was studied more closely. A simplified sequential extraction scheme, X-ray diffraction (XRD) and microscopic approaches were used in order to determine the individual phases containing TE in nodules. Along the slope, the nodule-rich horizon varies in thickness, is composed of different oxide fractions and has different scavenging efficiencies according to the TE considered. Iron was found to accumulate in the middle of the slope, while Mn accumulated at the base. The scavenging effect is only evident for Ni in profile 1. For Fe and Cu, it is maximal in profile 2 where the nodule-rich horizon is the thickest. For Pb and Mn, maximal scavenging effect is recorded for both profiles 2 and 3, in the lowest part of the slope. Cr is not accumulated at all. This was related to the water dynamic and the hydromorphic conditions prevailing along the slope. Results obtained by sequential extractions and associated X-ray diffraction on the different nodule size fractions and those obtained by electron microprobe allow inference of the TE distribution in nodules. Nodules were mainly composed of three to four types of cements surrounding grains of quartz, feldspars, micas and accessory minerals: iron-rich cements, Si- and Al-rich cements, Mn-rich cements and Ti-rich cements in places. The iron-rich cements consist of poorly crystalline goethite and possibly some ferrihydrite. Ferrihydrite is associated with Cr as demonstrated by extractions. Goethite contained Mn and most of the TE extracted except for Ni and Pb. Fine-grained Si- and Al-rich cements were also observed. They contain variable amounts of Ti and Mn. Mn-rich cements were not present in all the nodules and were mainly linked to the dark zones of the nodules. The nature of these Mn oxides could not be determined. They were found to contain Co, Ni, Cu and probably Pb.

    Autonomous exploration, active learning and human guidance with open-source Poppy humanoid robot platform and Explauto library

    Our demonstration presents an open-source hardware and software platform which allows non-roboticist researchers to conduct machine learning experiments to benchmark algorithms for autonomous exploration and active learning. In particular, in addition to showing the general properties of the platform such as its modularity and usability, we will demonstrate the online functioning of a particular algorithm which allows efficient learning of multiple forward and inverse models and can leverage information from human guidance. A first aspect of our demonstration is to illustrate the ease of use of the 3D printed low-cost Poppy humanoid robotic platform, that allows non-roboticists to quickly set up and program robotic experiments. A second aspect is to show how the Explauto library allows systematic comparison and evaluation of active learning and exploration algorithms in sensorimotor spaces, through a Python API to select already implemented exploration algorithms. The third idea is to showcase Active Model Babbling, an efficient exploration algorithm dynamically choosing which task/goal space to explore and particular goals to reach, and integrating social guidance from humans in real time to drive exploration towards particular objects or actions.